Journal of Proteome Research — Latest Matching Preprints

1

Hidden Structural Bias in Proteomics: Sonication-induced Selective Fragmentation of Intrinsically Disordered Regions

Narita, M.; Yamakawa, T.; Nishimura, R.; Iwasaki, M.

2026-07-15 cell biology 10.64898/2026.07.14.738389 medRxiv

Top 0.1%

52.6%

Sonication is a fundamental technique in proteome sample preparation, primarily used for protein solubilization and shearing of genomic DNA. Although the mechanical shearing of DNA is well-characterized, its unintended impact on protein structural integrity remains a significant "blind spot" in high-throughput analytical workflows. In this study, we systematically investigated sonication-induced protein fragmentation by combining gel-based fractionation (PEPPI-MS) with sequence-level compositional analysis and bioinformatic mapping. Our results demonstrate that sonication does not significantly alter overall proteome identification or the recovery of membrane proteins; however, it induces extensive and non-random protein fragmentation. Sonication caused an approximately three-fold increase in the abundance of >45 kDa protein-derived fragments migrating into the <40 kDa fraction, and 1,620 high-molecular-weight (MW) proteins were uniquely detected in the lower-MW fraction upon sonication, an eight-fold increase over non-sonicated controls. Peptide-level amino acid composition analysis revealed subtle but directional shifts in the sonication-derived fragments. This residue-level signature is reinforced by two orthogonal structural analyses (MobiDB peptide-level mapping and protein-level profiling using metapredict V3 software), which show that sonication-susceptible proteins harbor more than twice the disordered content of length-matched controls (median 40% vs. 18%). This study identifies a previously unrecognized "structural bias" whereby intrinsically disordered region (IDR)-rich proteins are selectively compromised during sample preparation. 
Because these fragments are indistinguishable from enzymatic digestion products in conventional bottom-up proteomics, the underlying structural damage is effectively masked in global quantitative datasets, potentially distorting biological interpretations related to protein size, isoforms, and stability, particularly for IDR-rich classes, such as transcription factors and signaling molecules. We propose that optimizing and standardizing sonication parameters is essential for ensuring the accuracy and reproducibility of quantitative proteomic analyses.
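
The disordered-content comparison in this abstract (median 40% vs. 18%) can be illustrated with a short calculation. This is a minimal sketch, assuming disordered content is the fraction of residues whose per-residue disorder score meets a cutoff. The 0.5 cutoff, the function name `disorder_fraction`, and the toy score lists are assumptions for illustration and are not taken from the preprint.

```python
from statistics import median
from typing import List


def disorder_fraction(scores: List[float], threshold: float = 0.5) -> float:
    """Fraction of residues whose disorder score is at or above `threshold`."""
    if not scores:
        raise ValueError("empty score list")
    return sum(s >= threshold for s in scores) / len(scores)


# Hypothetical per-residue scores for toy proteins (not real data).
susceptible = [[0.9, 0.8, 0.7, 0.2, 0.1], [0.6, 0.6, 0.3, 0.2, 0.1]]
controls = [[0.9, 0.1, 0.1, 0.2, 0.1], [0.2, 0.3, 0.1, 0.2, 0.1]]

# Median disordered fraction per group, as in the reported comparison.
med_susceptible = median(disorder_fraction(p) for p in susceptible)
med_control = median(disorder_fraction(p) for p in controls)
print(med_susceptible, med_control)

# Ratio of the medians reported in the abstract (40% vs. 18%).
reported_ratio = 40 / 18
print(round(reported_ratio, 2))
```

The reported medians give a ratio of about 2.2, consistent with the statement that susceptible proteins carry more than twice the disordered content of controls.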

2

Systematic optimization and benchmarking of synchro-PASEF for high-throughput phosphoproteome profiling

Brademan, D.; Mullarkey, A.; Greeson, M.; Szvetecz, S.; Vitek, O.; Blythe, E.; Huttenhain, R.

2026-06-27 biochemistry 10.64898/2026.06.26.734570 medRxiv

Top 0.1%

52.5%

High-throughput data-independent acquisition (DIA) workflows paired with short chromatographic separations are increasingly adopted for systems biology and clinical proteomics. However, narrower peak widths from rapid separations demand faster mass spectrometer cycle times to maintain quantitative depth and reproducibility. The synchro-PASEF acquisition mode on timsTOF mass spectrometers diagonally scans across ion mobility and m/z space, enabling efficient sampling of the precursor ion cloud with shortened cycle times. While synchro-PASEF has demonstrated competitive identification depth for global protein abundance samples compared to conventional dia-PASEF, its performance for phosphoproteomics, where the precursor ion cloud is characteristically broader and bimodally distributed, has not been evaluated. Here, we systematically optimized synchro-PASEF methods for phosphoproteomics and benchmarked performance against two dia-PASEF methods across three sub-hour separations. We found that synchro-PASEF performance depends critically on balancing diagonal window number, total isolation width, and gradient length, with longer gradients favoring more windows for selectivity and shorter gradients favoring fewer windows to preserve sampling frequency. An optimized configuration quantified over 19,000 localized phosphosites using a 23-minute separation. Retention time summation (RTsum) with a factor of 2 increased phosphopeptide identifications by 5-20% and reduced phosphosite-level coefficients of variation by up to 30% across all dia-PASEF and synchro-PASEF methods tested.
Using β2-adrenergic receptor (B2AR) activation as a signaling model, we demonstrate that label-free DIA phosphoproteomics can be used to model dose-response relationships, showing that synchro-PASEF and dia-PASEF produce highly concordant phosphoproteomic responses, with comparable numbers of responding phosphosites, similar effect sizes, and nearly identical predicted protein kinase A (PKA) substrates downstream of the activated B2AR. While synchro-PASEF did not surpass optimized dia-PASEF in identification depth, its comparable biological performance and amenability to post-acquisition optimization through RTsum support its utility for high-throughput phosphoproteomics. This work provides a transferable framework for synchro-PASEF method optimization and demonstrates the broad utility of retention time summation for PASEF-based phosphoproteomics workflows.
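
The retention time summation mentioned in this abstract can be pictured as combining adjacent scans along the retention-time axis. This is a minimal sketch, assuming that RTsum with a factor of 2 adds each pair of consecutive intensity readings for a precursor. The function name `rt_sum` and the toy elution trace are hypothetical, and the procedure used in the preprint may differ.

```python
from typing import List


def rt_sum(intensities: List[float], factor: int = 2) -> List[float]:
    """Sum consecutive groups of `factor` scans along retention time.

    A trailing group with fewer than `factor` scans is summed as-is.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    return [
        sum(intensities[i:i + factor])
        for i in range(0, len(intensities), factor)
    ]


# Toy elution trace for one precursor (arbitrary units, illustrative only).
trace = [1.0, 3.0, 8.0, 12.0, 7.0, 2.0, 1.0]
summed = rt_sum(trace, factor=2)
print(summed)
```

Under this assumption, summation trades points across the chromatographic peak for higher signal per point, which is one plausible reading of why it would affect both identifications and coefficients of variation.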

3

MassSpectrum Analyzer: An interactive platform for proteomic searching parameter refinement and peptide modification focused re-scoring

Karlic, K. I.; Scott, N. E.

2026-06-28 bioinformatics 10.64898/2026.06.22.733873 medRxiv

Top 0.1%

45.2%

Peptide spectrum annotation is critical for the assignment of peptides and the localisation of modifications. While many existing tools provide spectrum annotation capacities, they often lack the flexibility required to allow bespoke spectral annotation of peptides containing multiple labile modifications or the accurate assignment of peptides in which fragmentation deviates from canonical patterns. In these cases, user-guided annotation is widely used to improve assignment completeness; however, it typically does not integrate peptide scoring, making it challenging to assess the empirical improvement of the associated annotation and its impact on downstream false-discovery rate estimations. Here, we introduce an interactive annotation environment, the 'MassSpectrum Analyzer', which aims to streamline the exploration and analysis of modified peptides by enabling user-defined customisation with peptide scoring. Using (2-Aminoethyl)trimethylammonium carboxyl-derivatised peptides and glycopeptides as case studies, we demonstrate the capacity of the MassSpectrum Analyzer to rapidly explore and allow the assessment of modified peptide datasets. By enabling direct assessment of the impact of user-guided choices on peptide scoring, we show how the detection of highly modified peptides can be improved through post-search integration of modification fragmentation information in a statistically robust manner. Similarly, by permitting comparisons of peptide ion intensities across spectra, we show that global fragmentation patterns can be quantified, allowing the interrogation of trends that only become clear when spectra are assessed en masse. Combined, the MassSpectrum Analyzer streamlines the generation of publication-ready spectra and provides a means to assess how the inclusion of annotated features influences assignment scores.

4

onsite: An Integrated Framework for Phosphosite Localization and False Localization Rate Estimation

Yue, Q.-X.; Wei, Z.; Dai, C.; Bai, M.; Perez-Riverol, Y.; Sachsenberg, T.

2026-07-11 bioinformatics 10.64898/2026.07.08.737157 medRxiv

Top 0.1%

45.2%

With the rapid development of mass spectrometry-based proteomics, the volume of phosphoproteomic data has increased substantially. However, accurate localization of phosphorylation sites and standardized statistical validation remain critical analytical bottlenecks. To address the lack of standardized cross-algorithm evaluation, we introduce onsite, a unified and open-source Python framework. onsite integrates an alanine-decoy strategy to estimate the false localization rate (FLR) across three algorithms: AScore, PhosphoRS, and pyLucXor. This modular architecture efficiently processes large-scale datasets and enables global FLR calculation. Benchmarking on the standard synthetic phosphopeptide dataset PXD000138 highlighted distinct inter-algorithmic variations. Using the same 5% global FLR threshold, pyLucXor localized the most target sites (28,353). It also reached a high accuracy (91.22%) against the known ground truth, resulting in the largest number of correctly localized sites (25,865). Reanalysis of the highly fractionated, large-scale PXD012255 dataset further demonstrated that native integration of onsite into the quantms pipeline enables scalable processing and provides a standardized framework for FLR control in large-scale phosphoproteomics.
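
The figures in this abstract are internally consistent: 25,865 correct sites out of 28,353 localized sites is about 91.22%. The sketch below checks that arithmetic and illustrates one way a decoy-based global FLR could be computed. It assumes the FLR at a score cutoff is the number of decoy (alanine) localizations divided by the number of target localizations, scaled by a correction factor. The function `global_flr`, the parameter `target_to_decoy_ratio`, and the toy scores are hypothetical and do not represent the onsite API.

```python
from typing import List, Tuple


def global_flr(
    sites: List[Tuple[float, bool]],
    threshold: float,
    target_to_decoy_ratio: float = 1.0,
) -> float:
    """Estimate the FLR among target sites scoring at or above `threshold`.

    `sites` holds (score, is_decoy) pairs. `target_to_decoy_ratio` is an
    assumed correction for unequal numbers of target and decoy residues.
    """
    targets = sum(1 for score, is_decoy in sites if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in sites if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0
    return min(1.0, target_to_decoy_ratio * decoys / targets)


# Toy localization results (illustrative only).
toy_sites = [(0.99, False), (0.95, False), (0.90, True),
             (0.80, False), (0.70, False), (0.40, True)]
flr = global_flr(toy_sites, threshold=0.75)
print(round(flr, 3))

# Arithmetic check of the accuracy reported in the abstract.
accuracy = 25_865 / 28_353
print(round(accuracy * 100, 2))
```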

5

ProtPen combines sequence- and structure-based approaches to facilitate protein function predictions on a proteome-wide scale

Mathai, D.; Schulze, S.

2026-07-11 bioinformatics 10.64898/2026.07.11.737882 medRxiv

Top 0.1%

38.4%

Proteins of unknown function represent a significant gap in our understanding of biological processes, encompassing large portions of the proteomes of many organisms, especially prokaryotes. Addressing this gap is critical to understanding the biology and pathogenicity of such organisms. We introduce ProtPen, an open-source pipeline that facilitates protein function prediction by combining eggNOG-mapper for sequence-based annotation with Foldseek for rapid structural similarity searches using AlphaFold-predicted protein structures. Annotation results from both tools are merged and enriched with UniProt metadata to produce a comprehensive output suitable for downstream analysis. The pipeline requires only a FASTA input file with UniProt identifiers, and is designed to analyze datasets on the scale of whole proteomes. Benchmarking on a curated dataset of well-characterized Pseudomonas aeruginosa proteins demonstrated an annotation accuracy of >90%, and highlighted the complementarity of sequence- and structure-based methods. Further evaluation of ProtPen included its application to biologically relevant datasets, comprising proteins of unknown function that exhibited significant differential abundances in a proteomics dataset of P. aeruginosa, and uncharacterized glycoproteins from Haloferax volcanii. ProtPen is readily extensible to incorporate additional protein function prediction tools. In summary, this pipeline facilitates the systemwide annotation of proteins of unknown function from proteomic datasets and whole proteomes.

6

Kinetic Lipidomics: Quantifying in vivo changes in lipid metabolism using metabolic labeling

Nielsen, C.; Denton, R.; Driggs, B.; Gates, S.; Hilton, T.; Naylor, B.; Quilling, C.; Virgin, K.; Cutler, K.; Sorensen, M.; Poulson, M.; Snedaker, P.; Hernandez, Z.; Transtrum, M.; Price, J. C.

2026-07-01 biochemistry 10.64898/2026.06.29.735310 medRxiv

Top 0.1%

34.3%

Lipid metabolism reflects the dynamic balance between metabolic turnover and concentration. Kinetic mass spectrometry (MS) enables direct quantification of molecular turnover in vivo. Previous work has shown that MS-based kinetic proteomics provides powerful insights into proteome regulation. Analogous lipidome-wide kinetic measurements remain limited by challenges in defining molecule-specific labeling behavior. Here, we extend kinetic MS to untargeted lipidomics. Isotope labeling with deuterated water (²H₂O) is commonly used for monitoring turnover of palmitate and other select lipids by measuring labeling of stable C-H positions with deuterium (²H). We extend the deuterium-incorporation model underlying these targeted lipid turnover assays to support untargeted analysis of all detectable lipids. This allows us to empirically quantify the effective fraction of endogenous synthesis (Asyn) and the turnover rate (k) across hundreds of lipid species simultaneously. One central barrier to lipidome-wide kinetic modeling is determining the endogenous number of deuterium-labeling sites for each molecule (nL), which is required to estimate Asyn and k accurately. The nL value is an essential component of biological kinetic assays. In kinetic proteomics, curated amino acid nL libraries enable peptide-level modeling by summing sequence-specific labeling-site values, but comparable resources are lacking for lipids, and the existing amino acid values may not generalize to modified metabolic conditions or non-mammalian systems. Here, we empirically determine lipid nL values and validate the process with peptides against an nL library. To evaluate this strategy in a biologically relevant setting, we applied it to brain tissue from transgenic mice expressing human ApoE isoforms, where altered lipid transport and metabolism are implicated in Alzheimer's disease risk.
These data validate the method in a clinically relevant context and suggest that genotype-dependent metabolism can alter empirically determined lipid nL values.
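
The two kinetic parameters named in this abstract, Asyn and k, can be illustrated with a simple rise-to-plateau curve. This is a minimal sketch, assuming a single-pool first-order model in which the newly synthesized fraction at time t is Asyn × (1 − exp(−k × t)). The abstract does not state the model form, so this equation, the function names, the grid-search fit, and the synthetic time course are all assumptions, and the sketch does not model nL.

```python
import math
from typing import List, Tuple


def fraction_new(t: float, a_syn: float, k: float) -> float:
    """Assumed model: labeled fraction rises toward the plateau a_syn at rate k."""
    return a_syn * (1.0 - math.exp(-k * t))


def fit_turnover(
    times: List[float], fractions: List[float], steps: int = 100
) -> Tuple[float, float]:
    """Coarse grid search for (a_syn, k) minimizing the sum of squared errors.

    Both parameters are scanned over 1/steps, 2/steps, ..., 1.0.
    """
    best = (float("inf"), 0.0, 0.0)
    for i in range(1, steps + 1):
        a_syn = i / steps
        for j in range(1, steps + 1):
            k = j / steps
            sse = sum(
                (fraction_new(t, a_syn, k) - f) ** 2
                for t, f in zip(times, fractions)
            )
            if sse < best[0]:
                best = (sse, a_syn, k)
    return best[1], best[2]


# Synthetic time course (days) generated from known parameters, not real data.
times = [0.0, 1.0, 2.0, 4.0, 8.0, 16.0]
observed = [fraction_new(t, 0.6, 0.2) for t in times]
a_fit, k_fit = fit_turnover(times, observed)
print(a_fit, k_fit)
```

A grid search is used here only to keep the example dependency-free; a nonlinear least-squares routine would be the usual choice for real data.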

7

Enhanced proteome relative quantification using refined quantotypic spectral libraries

Barnes, B. A.; Alharbi, H.; Unwin, R.

2026-07-10 bioinformatics 10.64898/2026.07.06.736793 medRxiv

Top 0.1%

33.3%

Plasma proteomics is used for a variety of applications including biomarker discovery, disease monitoring, and drug development. Data-independent acquisition (DIA) has vastly improved the breadth of proteins that are identified from samples; however, given challenges in reproducibility and translation, it is critical that the quantitative performance of these methods is reliable. Analysis of global proteomics data typically incorporates information from all detected peptides. However, some peptides do not reflect their parent protein amount, due to irreproducible digestion, modification, analytical interferences or instability. We hypothesise that including these peptides impacts protein relative quantification, and thus, a refined spectral library containing only quantitatively representative peptides provides superior protein quantification. By analysing a defined multi-species spike-in model, we show that refining a plasma spectral library by removing precursors that fail to meet quality control metrics (25.4% of all identified precursors) reduces noise and variability, improving precision, accuracy and differential abundance analysis by up to ~11%, with minimal identification losses and substantial reduction in computational demand. This demonstrates proof-of-concept that refining spectral libraries produces results that prioritize quantification quality over quantity. This approach could enable development of universal tissue-specific refined spectral libraries able to improve quantification quality with easy implementation and minimal processing time. Significance of the Study: As DIA mass spectrometry proteome depth increases, the quality of the associated protein quantifications must be considered alongside identification breadth, particularly in complex matrices such as plasma, which presents additional technical challenges.
The spectral library used for protein identification and quantification is a critical determinant of DIA performance, and its composition requires careful consideration. This work illustrates an initial step toward improving protein quantification starting at the spectral library level by filtering precursors which are poor quantitative representatives of their parent proteins. In doing so, the resulting data is more reliable for downstream analysis and biological interpretation, with fewer false differential abundance assignments and reduced quantitative noise. As such, this work represents a broader shift away from the habitual focus of MS workflows on maximising the number of protein and differential abundance identifications and instead prioritises the quality of quantification over quantity. These initial findings lay the groundwork for further development of spectral library refinement strategies, with the potential to continue improving the accuracy and precision of protein quantification in DIA-based proteomics.
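
The library refinement described in this abstract amounts to dropping precursors that fail quality control. This is a minimal sketch, assuming the quality control metric is the coefficient of variation (CV) of a precursor's intensity across replicate injections. The abstract does not name the metrics used, so the CV criterion, the 20% cutoff, the function names, and the toy intensities are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List


def coefficient_of_variation(values: List[float]) -> float:
    """Sample standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        return float("inf")
    return stdev(values) / m


def refine_library(
    precursors: Dict[str, List[float]], max_cv: float = 0.20
) -> Dict[str, List[float]]:
    """Keep precursors with at least two replicates and CV <= max_cv."""
    return {
        name: values
        for name, values in precursors.items()
        if len(values) >= 2 and coefficient_of_variation(values) <= max_cv
    }


# Toy replicate intensities keyed by precursor (illustrative only).
library = {
    "PEPTIDEK/2": [100.0, 105.0, 98.0],
    "NOISYPEPR/2": [100.0, 40.0, 180.0],
    "STABLEK/3": [50.0, 52.0, 49.0],
}
refined = refine_library(library)
removed_fraction = 1 - len(refined) / len(library)
print(sorted(refined), round(removed_fraction, 3))
```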

8

Selective knockout of PKA regulatory subunits reveal opposite catalytic and metabolic consequences with implications for Alzheimer's disease

Rossitto, L.-A. M.; Lu, T.; Ma, Y.; Kaila Sharma, P.; Burghi, V.; Gonzalez, C. C.; Bruystens, J.; Maurya, S.; Wu, J.; Lona, A.; Kufareva, I.; Gutkind, J. S.; Gonzalez, D. J.; Chen, X.; Taylor, S. S.

2026-06-29 biochemistry 10.64898/2026.06.26.734839 medRxiv

Top 0.1%

27.0%

cAMP-dependent Protein Kinase A (PKA) is a master regulator of cell signaling involved in energy metabolism, synaptic plasticity, and stress response. Dysregulated PKA signaling is implicated in diseases including neurodegeneration and cancer. PKA catalytic activity is regulated by two nonredundant regulatory subunits, Type I (RIα/RIβ) and Type II (RIIα/RIIβ), whose divergent functions are not fully understood. We generated double-knockout (KO) cell lines of RIα/RIβ and RIIα/RIIβ subunits and performed multiplexed MS-based proteomic and phosphoproteomic profiling under basal and glucose-perturbed conditions. We found that RI and RII loss drives distinct, and often opposite, remodeling of the cellular proteome and phosphoproteome. While both mutants blunted metabolic flexibility to glycolytic stressors and stimuli, RI and RII KO cells exhibited elevated and depressed glycolytic signaling, respectively. Interestingly, RI KO increased the abundance and kinase activity of the PKA catalytic subunit Cα isoform, leading to an increase in PKA substrate phosphorylation, whereas RII KO decreased the abundance, kinase activity, and substrate phosphorylation by the catalytic subunit Cβ isoform. Notably, one of the most differentially affected PKA sites between RI and RII KOs maps to Tau, whose hyperphosphorylation is a hallmark of Alzheimer's disease. Loss of RI increased Tau phosphorylation, which was caused not only by increased PKA catalytic activity but also by a higher binding affinity of Tau to RII subunits on the negatively-charged flexible linker region. Overall, the present study demonstrates that PKA RI and RII subunits play nonredundant roles in modulating PKA activity, metabolic flexibility, and phospho-regulation of key disease-associated substrates such as Tau.

9

RNabel: A Standalone Software Tool for Annotating Tandem Mass Spectra of Modified Ribonucleic Acids

Song, G.; Du, Y.-J. N.; Sun, R.; Dong, M.-Q.

2026-06-24 bioinformatics 10.64898/2026.06.22.733900 medRxiv

Top 0.2%

23.7%

Ribonucleic acid (RNA) modifications, with over 170 identified types, play diverse roles in cellular processes. The past decade has witnessed surging demand for accurate identification and localization of RNA modifications in both endogenous and synthetic therapeutic RNAs. With accurate spectral annotation for RNA, tandem mass spectrometry (MS/MS) can meet this demand. Here we present RNabel, a user-friendly software tool for in-depth annotation of MS/MS spectra of RNA oligonucleotides. RNabel considers a full set of backbone-cleavage ions (a, b, c, d, a-B, w, x, y, z) in which the ribonucleotide unit could be A, U, C, G, Y (pseudouridine), or I (Inosine). Additionally, RNabel considers 196 modifications on the base, the phosphoribose linkage, the 5' or the 3' terminus, or detachment of a sub-nucleotide fragment as a neutral or charged group. Users can create new components if needed, including ribonucleotides, modifications, neutral or charged groups that could detach from a ribonucleotide. RNabel efficiently processes large datasets in four acceptable formats including .mgf, .raw, .txt from msConvert, and RNabel batch files. Multiple statistical metrics are provided for quality assessment of spectral annotation. To accelerate RNA modification analysis, RNabel is made freely available for Mac and Windows users at https://github.com/songge1111/RNabel/releases.

10

Integrative Proteomic Analysis Implicates Inhibition of Intracellular Protein Trafficking in Therapy-Induced Migrastasis in Prostate Cancer

Chen, W.; Rashidi, S.; Law, H. C.-H.; Qiao, F.; Zigmond, J. W.; O'Neill, K. L.; Woods, N. T.; Guda, C.; Bergan, R.

2026-07-10 cancer biology 10.64898/2026.07.02.736165 medRxiv

Top 0.2%

22.5%

Background: Dysregulated cell migration leading to metastasis remains the primary cause of cancer-related mortality. It has been challenging to understand how cells regulate migration. We have previously created the first selective inhibitor of cell migration, KBU2046. Here, we use it as a probe to identify regulatory processes. Methods: Metastatic and primary human prostate cancer cells were treated for different times and at different concentrations with KBU2046. Immunofluorescent microscopy examined protein localization in cells. Label-free mass spectrometry (MS) was performed on total cell proteins, Tandem Mass Tag (TMT) labeling MS was used on membrane fractions, and temporal phosphoproteomic profiling was also performed. Results were analyzed with a suite of bioinformatic tools. Results: KBU2046-induced migrastasis is associated with the accumulation of activated integrin β1 into focal adhesions. Whole-cell proteomics demonstrated suppression of processes that mediate intracellular protein trafficking and increases in mitochondrial energy-generation signatures. Evaluation of the membrane fraction identified increases in membrane repair and maintenance processes and decreases in those that drive motility. Temporal- and concentration-dependent phosphoproteomic profiling revealed that KBU2046 initiates a dynamic, cascading sequence of transient signaling waves rather than a static block. Conclusions: KBU2046-induced migrastasis appears to operate through spatial decoupling rather than structural degradation. By restricting the intracellular trafficking machinery required for receptor recycling, KBU2046 limits focal adhesion turnover, providing a correlative framework to inhibit metastatic dissemination independent of direct cytotoxicity.

11

Protein Aggregation Capture for Top-down Proteomics

Feltenstein, I. G.; Drown, B. S.

2026-07-03 biochemistry 10.64898/2026.07.02.736076 medRxiv

Top 0.2%

22.2%

Proteins are dynamically regulated by a myriad of post-translational modifications (PTMs) that control their stability, conformation, activity, subcellular localization, and local interactions. Capturing the precise composition of these various modification states, or proteoforms, is a principal objective of top-down proteomics (TDP). By ionizing intact proteoforms and combining measurements of precursor ion and fragment ion masses, the position, stoichiometry, and combination of PTMs can be determined. Despite the highly valuable measurements that TDP can provide, it is typically less sensitive than corresponding peptide-level analysis with many reports utilizing input material in the microgram to milligram range. Contributing to this lack of sensitivity is the risk of sample loss due to non-specific binding to surfaces during sample preparation. The most widely employed sample preparation approaches for TDP either require high sample input (e.g. precipitation and ultra-filtration) or fail to effectively remove surfactants (e.g. solid-phase extraction). These limitations have hindered advancement of targeted TDP applications involving immunoprecipitation and other enrichment strategies. Bead-assisted protein aggregation, also referred to as single-pot, solid-phase-enhanced sample preparation (SP3), has emerged as a popular sample preparation strategy for bottom-up proteomic workflows, but has only been used in TDP with secondary ion exchange chromatography cleanup. We envisioned a magnetic bead based protein cleanup approach that proceeds directly to MS analysis with judicious choice of bead surface chemistry and elution conditions. Here we report a sample preparation method using hydroxyl-functionalized magnetic beads for top-down proteomics applications.

12

Learning Fragmentation Physics or Exploiting Sequence Priors? Benchmarking Bias in Deep Learning Models for De Novo Peptide Sequencing

Li, J.; Rost, H.

2026-06-29 bioinformatics 10.64898/2026.06.23.734131 medRxiv

Top 0.2%

19.4%

Deep learning models have advanced de novo peptide sequencing, but their predictions may reflect both physics-based spectral evidence and learned peptide-sequence priors. Systematically measuring such prior-associated behavior is important for benchmarking model robustness beyond conventional proteomics data. Here, we introduce the Prior Bias Index (PBI), a general framework for measuring the extent to which model behavior shifts toward prior-associated reference patterns under controlled conditions, and implement it as DeNovo-PBI, a benchmark for quantifying prior bias in de novo peptide sequencing models. DeNovo-PBI combines benchmark dataset construction, in silico sequence and spectral perturbation workflows, PBI-based metrics, and analysis algorithms to evaluate three forms of prior-associated behavior: sequence-distribution dependence, database amino-acid-pair order preference, and mutation-group prediction consistency under shared sequence context. In addition to experimentally acquired peptide spectra, we generated in silico spectra from random, natural, and mutated peptide sequences and selectively removed fragment ions that distinguish N-terminal residue orders. Across these assays, deep learning models showed peptide-sequence-distribution-dependent performance and strong directional amino-acid-pair order preferences even when order-diagnostic spectral evidence was removed. DeNovo-PBI provides a quantitative benchmark for measuring, comparing, and interpreting learned bias in de novo peptide sequencing models.

13

Sortase-mediated enrichment of ubiquitinated proteins from complex samples

Raniszewski, N.; Beckley, K.; Hintzen, J.; Noel, M.; Burslem, G.

2026-07-01 biochemistry 10.64898/2026.06.29.735432 medRxiv

Top 0.2%

18.9%

Despite its importance in cellular signaling and protein fate, the detection of protein ubiquitination in proteomics experiments presents many challenges for researchers. Importantly, current techniques that often rely on antibodies specific for lysine sidechain modifications may miss non-canonical ubiquitination sites in experiments. We envisioned a strategy that uses sortase, a bacterial transpeptidase enzyme, to selectively modify ubiquitination sites with a Biotin tag for enrichment and downstream proteomics experiments. In this work, we demonstrate our ability to selectively modify N-terminal diglycine remnants in digested proteins with a Biotin-modified peptide, enabling downstream enrichment of previously ubiquitinated proteins. We show this proof of concept on several recombinant proteins, revealing a site of autoubiquitination in the E2 conjugating enzyme Ubc13. We show that elution of the enriched peptides can be achieved by using common guanidinium elutions or by leveraging the reversibility of sortase. Finally, we include a bifunctional peptide that is labile to trypsinization to better streamline this strategy for downstream proteomics approaches. We envision that this approach will provide an accessible strategy for the detection of ubiquitinated proteins in proteomics experiments, with the goal of enabling researchers to better detect noncanonical protein ubiquitination.

14

Data Independent Acquisition Pipeline for Microbiome Samples (Microbe-DIA)

Obermiller, S. A.; Lipton, M. S.; Piehowski, P. D.; Bilbao, A.; McCue, L. A.; Prozapas, V. N.; Attah, I. K.

2026-07-14 microbiology 10.64898/2026.07.13.738261 medRxiv

Top 0.2%

18.7%

The functional complexity inherent in microbiomes complicates analytical approaches aimed at defining phenotype. As proteins are the functional effectors of microbiome phenotypes, improving the performance of mass spectrometry-based metaproteomics is critical to achieving the functional characterization of these systems. Data-independent acquisition (DIA) improves protein coverage and reduces data missingness when compared to data-dependent acquisition (DDA) in metaproteomics. However, the application of DIA to complex microbial systems remains constrained by analytical throughput and computational scalability. Here, we optimized LC-MS/MS acquisition parameters for both DDA and DIA using a model microbiome, demonstrating how DIA enables increased sample throughput without compromising quantitative performance. In addition, we demonstrated a computationally efficient, library-free DIA workflow that overcomes reliance on empirical spectral libraries. Our analytical and computational innovations establish a scalable and cost-effective pipeline for metaproteomics of complex microbial communities.

15

Development of an Ethylenediaminetetraacetic Acid-Enhanced Deep Proteomic Profiling Method for Dried Blood Spots and Its Application in Mouse Disease Models

Nakajima, D.; Kanno, T.; Okuda, Y.; Mitsui, H.; Konno, R.; Ueyama, N.; Endo, Y.; Ohara, O.; Kawashima, Y.

2026-07-14 molecular biology 10.64898/2026.07.13.738354 medRxiv

Top 0.2%

18.7%

Dried blood spots (DBS) are well-established microsamples used in clinical testing and newborn screening. However, their use in deep proteomics is hindered by highly abundant blood proteins and inefficient protein recovery from filter paper matrices. The non-targeted analysis of non-specifically DBS-absorbed proteins (NANDA) workflow partially overcomes the impact of abundant blood proteins and has enabled the identification of over 5,000 proteins from DBS samples. Nonetheless, residual abundant proteins, including hemoglobin and fibrinogen, constrain deep proteomic analysis. Therefore, this study aimed to evaluate the effects of the metal chelator ethylenediaminetetraacetic acid (EDTA) on the depth of DBS proteomic analysis. An optimized EDTA-enhanced NANDA protocol that incorporated a 100 mM EDTA wash step was compatible with standard DBS collection procedures and required no modification of current clinical workflows, markedly enhancing the depletion of abundant proteins and facilitating its potential use in clinical and translational settings. When combined with Orbitrap Astral data-independent acquisition mass spectrometry, this approach enabled the single-shot identification of more than 7,000 proteins from DBS samples; to the best of our knowledge, this represents the deepest proteome coverage reported to date, and the workflow further supported high-throughput and highly reproducible analyses. Additionally, its application to mouse disease models revealed disease-specific systemic immune signatures from minimal blood volumes. Collectively, these results establish EDTA-enhanced NANDA as a practical and scalable workflow that overcomes longstanding limitations of DBS proteomics, thereby enabling deep, high-throughput, minimally invasive proteomic profiling across diverse biological and experimental contexts.

16

ProteoDUDes: Taxonomic profiling for metaproteomics with false positive reduction

Schiebenhoefer, H.; Muth, T.; Fuchs, S.; Renard, B. Y.

2026-07-02 bioinformatics 10.64898/2026.06.29.734936 medRxiv

Top 0.3%

15.9%

Metaproteomics is the investigation of the protein composition of multi-organism samples. While metagenomics answers the question of which organisms are present in a sample, metaproteomics additionally answers the question of which organisms are active. State-of-the-art tools for annotating proteomic data with taxonomic information (e.g., Unipept, DIAMOND) do not control the false taxonomic identification rate, which can lead to incorrect results and thus incorrect interpretations, as we demonstrate with examples. ProteoDUDes processes the results from popular sequence annotation tools so that the proportion of true identifications in the result is as high as or higher than in the compared tools. We evaluate ProteoDUDes on simulated data and experimental mock community data. Our results indicate that ProteoDUDes has the same error rate as other tools on simulated data and half the error rate on the experimental mock community data. This allows more accurate statements to be made about which organisms are functionally active in a complex sample. ProteoDUDes is open-source and available at https://github.com/pirovc/dudes.

17

Contextualised real-time mass spectrometry improves glycosylation detection and characterisation

Kelly, M. I.; Ashwood, C.

2026-07-03 biochemistry 10.64898/2026.07.03.736344 medRxiv

Top 0.3%

15.5%

Glycosylation is a structurally diverse, non-template-driven modification whose analysis by liquid chromatography-mass spectrometry is constrained by discovery-mode acquisition rules developed for proteomics. Data-dependent acquisition filters, such as intensity-based precursor selection and charge-state exclusion, map poorly onto glycans, which span wide ranges of charge state and abundance independent of their biological importance. Here we present glycosylation real-time mass spectrometry (GlycoRTMS), an instrument-API method that annotates observed precursor masses with glycan compositions in real time and uses this context to guide fragmentation. Composition-aware precursor prioritisation sampled deeper into the precursor space, expanding MS2 coverage of a hyaluronic acid hydrolysate from four to eight oligosaccharide subunits. Charge-state-specific collision energy equations tailored to oligosaccharides produced complete fragment ladders where fixed normalised collision energy did not. MS3 triggering gated by both diagnostic ions and glycan composition matching enabled efficient, chromatography-compatible characterisation of O-acetylated sialic acids and identified product ions specific to O-acetylation. Together, these strategies improve both the depth and quality of glycan detection and characterisation within a single injection.

18

Defining Quality Control Standards for Single-Cell Proteomics by Inter-Laboratory Benchmarking

van Puyenbroeck, S.; Claeys, T.; Seth, A.; Rijal, J.-B.; Keller, C.; Lin, L.; Mayer, R.; Matzinger, M.; Han, I.; Aragon Fernandez, P.; Petrosius, V.; Boyle, B.; Rivera, K.; Tourniaire, G.; Rosenberger, F. A.; Martens, L.; Carr, S. A.; Dong, Z.; Vegvari, A.; Carapito, C.; Kelly, R.; Mechtler, K.; Budnik, B.; Schoof, E. M.; Ctortecka, C.

2026-07-14 biochemistry 10.64898/2026.07.13.738155 medRxiv

Top 0.3%

15.5%

Single-cell proteomics can quantify thousands of proteins from individual mammalian cells, yet the absence of community-wide quality control limits biological interpretability. Here, the HUPO Single Cell Initiative presents the first inter-laboratory single-cell proteomics benchmarking study across seven laboratories using standardized 384-well plates acquired on Orbitrap Astral and timsTOF Ultra2 instruments. Centralized analysis across six DIA software tools revealed that software choice impacts identification depth and quantitative accuracy more than instrument vendor. Multi-layered quality control enabled the detection of cell leakage during sorting, LC misconfiguration, column degradation, and site-specific pipetting failures. Inter-lab quantitative correlations were stronger between instruments of the same vendor than in cross-platform comparisons. Sequential correction for plate identity and well position recovered clean cell-type separation for confident downstream differential expression analysis. This study provides a data-driven quality control framework spanning plate design to batch correction for reproducible single-cell proteomics across laboratories and platforms.

19

RegulomeXplorer: Interactive exploration of drug effects on subcellularly resolved proteomes

Uiberacker, M.; Iellici, T.; Afanaseva, E.; Meier-Menches, S.; Zanghellini, J.

2026-07-03 bioinformatics 10.64898/2026.06.29.735319 medRxiv

Top 0.3%

15.4%

Mass spectrometry-based proteomics allows the quantification of drug-induced changes in protein abundance. However, the integration of perturbation data across subcellular compartments remains a challenging bottleneck. Here, we present RegulomeXplorer, a web-based tool for automated processing and interactive exploration of subcellular compartment-resolved proteomics data. RegulomeXplorer uses MaxQuant output files to determine differential protein regulation upon drug perturbation, performs functional enrichment analysis, and visualizes enriched terms on a two-dimensional cytoplasmic-nuclear plane, called a regulome. Visualizing the data as regulomes allows simultaneous assessment of the magnitude of drug perturbation effects within separate subcellular compartments and of the contribution of regulated proteins to the position of each enriched term in the regulome plane. We validated RegulomeXplorer against previously published, manually curated regulome analyses. It was then applied to subcellular compartment-resolved breast cancer cell line proteomes, revealing drug- and cell-line-specific responses to Doxorubicin and Taxol, both in line with their described modes of action. RegulomeXplorer provides an accessible workflow for interpreting compartment-resolved perturbation proteomics and generating mode-of-action hypotheses in drug-response studies. RegulomeXplorer is freely available without registration at https://chemnettools.anc.univie.ac.at/RegulomeExplorer/.

20

In Vivo Quantification of Histone Acetylation Turnover and Acetyl-CoA Sources Using 2H2O Metabolic Labeling and High-Resolution Mass Spectrometry

Arias-Alvarado, A.; Sabir, U.; Ilchenko, S.; Parrish, S.; Aghayev, M.; He, W.; Tsai, T.-H.; Zhang, G.; Kasumov, T.

2026-06-29 biochemistry 10.64898/2026.06.26.734905 medRxiv

Top 0.3%

15.1%

Dysregulated histone acetylation links cellular metabolism to gene expression, but measuring its in vivo turnover remains technically challenging. Here, we introduce a 2H2O-based metabolic labeling method coupled with high-resolution Orbitrap mass spectrometry to quantify in vivo histone acetylation dynamics. The approach leverages differing deuterium incorporation rates between fast-labeling acetyl groups and slow-labeling peptide backbones. A two-tier analytical workflow uses full-scan mass spectrometry for mono-acetylated peptides, combined with parallel reaction monitoring (PRM) to resolve site-specific turnover and stoichiometry. Furthermore, monitoring the plateau 2H enrichment of acetyl groups enables the evaluation of specific substrate contributions to the acetyl-CoA pool supporting histone acetylation. To demonstrate biological utility, we applied this approach to mice maintained on a high-carbohydrate diet or subjected to 48-h fasting to assess nutrient-dependent histone acetylation dynamics. Acetyl-group labeling reflected the metabolic origin of acetyl-CoA, showing greater 2H enrichment in the fed state and reduced enrichment during fasting due to increased utilization of unlabeled fatty acid-derived acetyl-CoA. Fasting accelerated acetylation turnover across multiple histone sites and reduced overall acetylation stoichiometry. Quantitative tracing revealed that fatty acid oxidation becomes an important contributor to histone acetylation during fasting, whereas glucose remains the predominant source of nucleo-cytosolic acetyl-CoA (supplying >60% of the carbon used for acetylation). This approach enables simultaneous in vivo assessment of histone acetylation turnover, site occupancy, and acetyl-CoA substrate utilization, offering a robust platform to investigate metabolic-epigenetic crosstalk in health and disease.